points in plural images obtained from plural image pickup systems. The paired corresponding points,
thus extracted, are synthesized, in a synthesis/conversion unit, into a panoramic image or a high definition
image. Thus achieved are a reduction in the search time
and an improvement in the precision of search.

Description

BACKGROUND OF THE INVENTION 
Field of the Invention

The present invention relates to a compound-eye image pickup device utilizing an imaging optical system including 
plural image sensor devices and plural lenses. 

Related Background Art

For the purpose of generating a wide panoramic image or a high definition image, there has recently been proposed a
compound-eye image pickup device provided with plural image pickup systems, each composed of an imaging optical
system and an image sensor device and adapted to take the image of a common object, whereby a synthesized image
is generated from image signals obtained from the image sensor devices.

For obtaining a panoramic image, there is known a method of simultaneously taking plural images in an object field
with plural image pickup systems, then extracting the same object present in different images and connecting the images
based on the relative positional information of said object in the images, thereby generating a synthesized panoramic
image.

Also for obtaining a high definition image, there is known a method of extracting the same object present in different
images in a similar manner as in the panoramic image formation, and effecting interpolation based on the relative
positional information of said object in the images, thereby generating anew a high definition image. An image pickup
device based on the above-mentioned principle is provided, as shown in Fig. 1, with a left-hand side image pickup
system 10L and a right-hand side image pickup system 10R which are used to take the image of an object 11, and a
left image I_L obtained by the left-hand side image pickup system 10L and a right image I_R obtained by the right-hand
side image pickup system 10R are subjected, in an image processing unit 12, to extraction of corresponding points and
synthesis, whereby an output image I_out is obtained with a higher definition than in the case of taking the object with a
single image pickup system.

However, the above-mentioned method of obtaining the synthesized panoramic image by extracting the same object
present in different images and connecting the different images based on the relative positional information of the object
in the images has been associated with a drawback of requiring a very long time in acquiring the relative positional
information mentioned above.

SUMMARY OF THE INVENTION 


In consideration of the foregoing, a concern of the present invention is to provide a compound-eye image pickup 
device in which the range for searching said relative positional information is limited to a partial region of the image, 
based on the information on the arrangement of the image pickup device and the image taking parameters. 

A second concern of the present invention is to provide a compound-eye image pickup device in which the
above-mentioned searching range is set, in consideration of the individual difference, for example in the image pickup
parameters, of the plural image pickup systems, to a region not affected by the individual difference plus an additional
region in consideration of an error resulting from the individual difference.

According to a preferred embodiment of the present invention, there is provided a compound-eye image pickup
device comprising plural image pickup systems, search means for searching mutually corresponding pair points from
plural images obtained from the image pickup systems, and search range determination means for determining the
range to be searched by said search means from image pickup parameters of the plural image pickup systems.

Also according to a preferred embodiment of the present invention, said search range determination means is
adapted to set the search range by selecting a range determined from the image pickup parameters as a basic range
and adding a marginal range based on the individual difference of the plural image pickup systems.

A third concern of the present invention is, in effecting matching operation for synthesizing plural images, to enable
determination of corresponding points in the entire area, where the corresponding points can exist, in a reference image,
and to enable calculation of similarity in the entire area of the image to be searched, thereby improving the precision of
extraction of corresponding points.

Still other concerns of the present invention, and the features thereof, will become fully apparent from the following
description, which is to be taken in conjunction with the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a view of a conventional compound-eye image pickup device; 

Fig. 2 is a view showing the configuration of an embodiment of the compound-eye image pickup device of the 
present invention; 

Fig. 3 is a block diagram showing the configuration of an image processing unit shown in Fig. 2; 
Fig. 4 is a schematic view showing the configuration of the principal part in Fig. 2; 
Figs. 5A and 5B are views showing the mode of image taking; 

Fig. 6 is a schematic view showing the principle of projection of an object point P on sensors; 
Fig. 7 is a schematic view showing correction of convergence angle; 

Fig. 8 is a schematic view of an image pickup plane in a world coordinate system of the right-hand image pickup 
system; 

Figs. 9A and 9B are views showing a search range in a first embodiment of the present invention; 
Fig. 10 is a view showing a search range in a second embodiment of the present invention; 
Fig. 11 is a view showing the principle of a template matching method;
Figs. 12 and 13 are views showing the drawback in the template matching method;

Figs. 14A and 14B are views showing the principle of an improved template matching method of a third embodiment;
Fig. 15 is a block diagram showing a system for extracting a moving object in the third embodiment; 
Fig. 16 is a block diagram showing a system for extracting a moving object in a fourth embodiment; 
Fig. 17 is a view showing the principle of epipolar transformation; 
Figs. 18A and 18B are views showing a search range in the fourth embodiment; and 
Fig. 19 is a view showing the principle of trigonometry.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In the following there will be explained a first embodiment of the present invention with reference to the attached
drawings, beginning with the part from the image pickup system to the generation of a synthesized image.

The compound-eye image pickup device of the present embodiment is, as shown in Fig. 2, to obtain a synthesized 
panoramic image by parallel connection of two images, obtained by taking an object with a right-hand side image pickup 
system 10R and a left-hand side image pickup system 10L. 

At first there will be explained the left-hand side image pickup system 10L, which is composed of a phototaking lens
group 11L constituting an imaging optical system incorporated in an unrepresented lens barrel, a color-separation prism
12L mounted on said phototaking lens group 11L, for separating the light from the object into three primary colors, and
three CCD image sensors 13L (only one being illustrated) provided respectively corresponding to the lights separated
by the color-separation prism 12L and respectively having rectangular effective light-receiving areas. The phototaking
lens group 11L is composed of plural lenses including a focusing lens group 15L driven by a focusing motor 14L and a
zooming lens group 17L driven by a zooming motor 16L, and said motors 14L, 16L are driven by control signals from a
system control unit 21 and a focus/zoom control unit 22 in an operation control unit 20 for controlling the optical systems.

A right-hand side image pickup system 10R is constructed similarly to the left-hand side image pickup system 10L,
and the optical axis L_R of the phototaking lens group 11R of said right-hand side image pickup system 10R and the
optical axis L_L of the phototaking lens group 11L of the left-hand side image pickup system 10L lie on a same plane.

The lens barrels incorporating said phototaking lens groups 11L, 11R are respectively connected to the rotary shafts
of convergence angle motors 18L, 18R driven by control signals from a convergence angle control unit 23 of the operation
control unit 20. The rotary shafts of the convergence angle motors 18L, 18R extend perpendicularly to the plane
containing the optical axes L_L, L_R of the phototaking lens groups 11L, 11R, and the activation of the convergence angle
motors 18L, 18R respectively rotates the phototaking lens groups 11L, 11R integrally with the color-separation prisms
12L, 12R and the CCD sensors 13L, 13R, whereby the mutual angle (convergence angle) of the optical axes L_L,
L_R of the phototaking lens groups 11L, 11R is set.

Also the image pickup systems 10L, 10R are respectively provided with focus encoders 24L, 24R for detecting the
positions of the focusing lens groups 15L, 15R, zoom encoders 25L, 25R for detecting the positions of the zoom lens
groups 17L, 17R, and convergence angle encoders 26L, 26R for detecting the convergence angles. These encoders
may be composed of externally added devices such as potentiometers or may be so constructed as to detect the
respective positions or angles by signal information provided by the driving systems themselves such as stepping motors.

To the CCD sensors 13L, 13R there is connected an image output unit 40 through an image processing unit 30,
featuring the present invention. The image processing unit 30 is provided, as shown in Fig. 3, with an image input unit
32 consisting of a left image memory 31L and a right image memory 31R for respectively storing the image (video) signals
from the CCD sensors 13L, 13R (cf. Fig. 2) of the image pickup systems 10L, 10R, an image conversion unit 38 for
generating a synthesized image based on left and right images obtained from the video signals entered into the image
input unit 32, and a synthesized image memory 39 for storing the image synthesized in the image conversion unit 38,
for supply to the image output unit 40.

The image conversion unit 38 consists of a corresponding point extraction unit 33 for extracting paired corresponding
points in the images entered into the image input unit 32, and a synthesis conversion unit 41 for calculating the
three-dimensional position (distance information) of the paired corresponding points, based on the result of extraction thereof,
and synthesizing an image utilizing said information.

Fig. 4 illustrates the principal part of the optical systems of the compound-eye image pickup device in Fig. 2, seen
from a direction perpendicular to the plane defined by the optical axes L_L, L_R of the phototaking lens groups 11L, 11R.
For the simplification of the description, the color-separation prisms 12L, 12R (cf. Fig. 2) are omitted, and the CCD
sensors 13L, 13R are illustrated in only one unit at each side. In the following there will be explained an example in
which the focused planes mutually meet at the end points thereof, but such configuration is not essential in practice. As
shown in Fig. 4, the phototaking lens group 11R and the CCD sensor 13R of the right-hand side image pickup system
10R have a focused object plane 50R, and the image taking is limited by the effective light-receiving area of the CCD
sensor 13R into a range between lines 51R and 52R, so that an effective object field is defined on the focused object
plane 50R, from a crossing point B_R to a crossing point A with said lines 51R, 52R. Also for the left-hand side image
pickup system 10L, an effective object field is similarly defined on the focused object plane 50L, from the crossing point
A to a crossing point B_L.

The focusing motors 14L, 14R (cf. Fig. 2) and the zooming motors 16L, 16R (cf. Fig. 2) of the left and right-hand
side image pickup systems 10L, 10R are so controlled that the distances between the focused object planes 50L, 50R
and the CCD sensors 13L, 13R and the imaging magnifications are mutually the same in the left- and right-hand sides.

The motors 14L, 14R, 16L, 16R, 18L, 18R are controlled by the operation control unit 20 (cf. Fig. 2) receiving the
signals from the encoders 24L, 24R, 25L, 25R, 26L, 26R (cf. Fig. 3). In particular, the convergence angle motors 18L,
18R are controlled in relation to the positions of the focused object planes 50L, 50R and the positions of the effective
object fields, calculated from the output signals of the focus encoders 24L, 24R and the zoom encoders 25L, 25R.

In the following there will be briefly explained the procedure of the synthesis. The corresponding point extraction 
unit 33 shown in Fig. 3 extracts paired corresponding points of the images. A representative method for such extraction 
is the template matching method. In this method there is conceived a template surrounding a point for example in the 
left image, and corresponding points are determined by the comparison of similarity in the right image, with respect to 
the image in said template. In the correlation method used for comparing the similarity, there is calculated the mutual 
correlation between the pixel values of the template image and those in the searched image and the corresponding 
point is determined at a coordinate where the mutual correlation becomes maximum, according to the following equation: 

$$\sigma(m_R, n_R, m_L, n_L) = \frac{\sum_i \sum_j R(m_R - i,\, n_R - j)\, L(m_L + i,\, n_L + j)}{\sqrt{\sum_i \sum_j R^2(m_R - i,\, n_R - j) \cdot \sum_i \sum_j L^2(m_L + i,\, n_L + j)}} \tag{1}$$

wherein R(m_R, n_R) and L(m_L, n_L) stand for the pixel values of the right and left images, σ(m_R, n_R, m_L, n_L) indicates
the level of correlation, and m_R, n_R, m_L and n_L indicate the coordinates of the pixels. In the summations of squares or
products, the sign in front of i, j is inverted in the right and left images because the coordinate axis is defined symmetrically
to the right and to the left as shown in Fig. 5B. The normalized mutual correlation represented by the equation (1) has
a maximum value of unity. Another known method for this purpose is the SSDA method, which is also a kind of template
matching method. In this method, the remnant difference is calculated by:

$$\sigma(m_R, n_R, m_L, n_L) = \sum_i \sum_j \left| R(m_R - i,\, n_R - j) - L(m_L + i,\, n_L + j) \right| \tag{2}$$

In the course of calculation of summation, the calculation is interrupted when the remnant difference exceeds a
predetermined threshold value, and the calculation proceeds to a next combination of (m_R, n_R) and (m_L, n_L). The threshold
value is generally selected as the minimum value of the remnant difference in the past.
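
The early interruption described for equation (2) can be sketched as follows. This is only an illustrative sketch, not the device's implementation: the images are assumed to be lists of rows indexed as `image[n][m]`, the template is assumed to be a square of half-width `half`, and the function names are hypothetical.

```python
def ssda_residual(right, left, m_r, n_r, m_l, n_l, half, threshold):
    """Remnant difference of equation (2), abandoned once it exceeds `threshold`.

    Images are assumed to be lists of rows, indexed image[n][m].
    Returns None when the summation was interrupted.
    """
    total = 0
    for i in range(-half, half + 1):
        for j in range(-half, half + 1):
            # The sign in front of i, j is inverted between the two images.
            total += abs(right[n_r - j][m_r - i] - left[n_l + j][m_l + i])
            if total > threshold:
                return None  # interrupt and move on to the next combination
    return total


def ssda_search(right, left, m_l, n_l, candidates, half):
    """Try each candidate (m_r, n_r); the threshold is the best residual so far."""
    best, best_pos = float("inf"), None
    for m_r, n_r in candidates:
        r = ssda_residual(right, left, m_r, n_r, m_l, n_l, half, best)
        if r is not None and r < best:
            best, best_pos = r, (m_r, n_r)
    return best_pos, best
```

Using the smallest residual found so far as the threshold means that a poor candidate is usually rejected after only a few pixel differences have been summed.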

Based on the information on the corresponding points, there is determined the position of each pair of the
corresponding points in the three-dimensional space, by a trigonometric method.

As shown in Fig. 6, the centers O_L, O_R of the object-side principal planes of the left and right phototaking lens groups
11L, 11R (cf. Fig. 2) are positioned on the X-axis, symmetrically with respect to the Z-axis, and the length between said
centers O_L, O_R is defined as the baseline length b. Thus said centers O_L and O_R are represented by coordinates (-b/2,
0, 0) and (b/2, 0, 0). It should be noted that practically, the object is picked up by the optical system shown in Fig. 2,
while it is assumed in Fig. 6 for convenience that the image pickup plane is at the position P_L, P_R in front of the lens
optical system. When a point P in the three-dimensional space is projected toward the centers O_L, O_R, there are obtained
projection points P_L, P_R respectively on the left and right CCD sensors 13L, 13R. These points P, P_L and P_R are
respectively represented by coordinates (X, Y, Z), (X_L, Y_L, Z_L) and (X_R, Y_R, Z_R).

A plane defined by the three points P, P_L, P_R in the three-dimensional space is called an epipolar plane, and the
crossing line of the epipolar plane and the sensor plane is called an epipolar line. In this relation, the coordinate (X, Y,
Z) of the point P can be given by the following equations (3), (4) and (5):

$$X = \frac{b}{2} \cdot \frac{\{X_L + (b/2)\}/Z_L + \{X_R - (b/2)\}/Z_R}{\{X_L + (b/2)\}/Z_L - \{X_R - (b/2)\}/Z_R} \tag{3}$$

$$Y = \frac{Y_R}{Z_R} \cdot Z = \frac{Y_L}{Z_L} \cdot Z \tag{4}$$

$$Z = \frac{b}{\{X_L + (b/2)\}/Z_L - \{X_R - (b/2)\}/Z_R} \tag{5}$$
Also there stand the following relations:

$$Z_R = \{X_R - (b/2) + f \sin\theta\} \tan\theta + f \cos\theta \tag{6}$$

$$Z_L = -\{X_L + (b/2) - f \sin\theta\} \tan\theta + f \cos\theta \tag{7}$$

wherein θ is an angle (convergence angle) of the optical axes L_L, L_R of the left and right phototaking lens groups 11L,
11R to lines respectively passing the centers O_L, O_R of the object-side principal planes and parallel to the Z-axis, and
f is the focal length of the phototaking lens groups 11L, 11R. Thus the coordinate (X, Y, Z) of the point P can be determined
from the foregoing equations. The coordinate conversion is conducted, based on the above-mentioned coordinate, to
obtain an image seen from a point, for example from the middle point of the two image pickup systems.
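
As a rough illustration of the trigonometric step, the sketch below intersects the ray from (-b/2, 0, 0) through P_L with the ray from (b/2, 0, 0) through P_R, which is the geometric content of equations (3) to (5), and evaluates equations (6) and (7) for the Z coordinates of the projection points. The function names and argument layout are assumptions made for this example.

```python
import math


def sensor_depths(x_l, x_r, b, f, theta):
    """Z coordinates of the projection points, following equations (6) and (7)."""
    z_r = (x_r - b / 2 + f * math.sin(theta)) * math.tan(theta) + f * math.cos(theta)
    z_l = -(x_l + b / 2 - f * math.sin(theta)) * math.tan(theta) + f * math.cos(theta)
    return z_l, z_r


def triangulate(p_l, p_r, b):
    """Recover P = (X, Y, Z) from projection points P_L and P_R and baseline b."""
    x_l, y_l, z_l = p_l
    x_r, y_r, z_r = p_r
    a_l = (x_l + b / 2) / z_l   # slope of the left ray in the X-Z plane
    a_r = (x_r - b / 2) / z_r   # slope of the right ray in the X-Z plane
    z = b / (a_l - a_r)
    x = (b / 2) * (a_l + a_r) / (a_l - a_r)
    y = (y_l / z_l) * z
    return x, y, z
```

For example, with b = 2 the object point (1, 2, 10) projects to P_L = (-0.8, 0.2, 1.0) and P_R = (1.0, 0.4, 2.0) along the two rays, and `triangulate` recovers the original point up to rounding.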

In the following there will be explained a conversion of the images, taken with a convergence angle as shown in
Fig. 7, into images without the convergence angle, namely as if taken in the parallel state, for the purpose of determination
of the search range. It should be noted that practically, the object is picked up by the optical system shown in Fig. 2,
while it is assumed in Fig. 6 for convenience that the image pickup plane is at the position P_L, P_R in front of the lens
optical system.

As shown in Fig. 8, three axes are represented by X, Y, Z; rotational motions about the three axes by A, B, C;
translational motions by U, V, W; focal length by f; coordinate axes in an image pickup plane by x, y; and a point on the
image pickup plane corresponding to the object point P(X, Y, Z) by p(x, y). It should be noted that practically, the object
is picked up by the optical system shown in Fig. 2, while it is assumed in Fig. 6 for convenience that the image pickup
plane is at the position P_L, P_R in front of the lens optical system. In this state there stand:

$$x = f \cdot (X/Z) \tag{8}$$

$$y = f \cdot (Y/Z) \tag{9}$$

With the rotational and translational motions of the three axes, there stands:
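$$\begin{pmatrix} X' \\ Y' \\ Z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos A & \sin A \\ 0 & -\sin A & \cos A \end{pmatrix} \begin{pmatrix} \cos B & 0 & -\sin B \\ 0 & 1 & 0 \\ \sin B & 0 & \cos B \end{pmatrix} \begin{pmatrix} \cos C & \sin C & 0 \\ -\sin C & \cos C & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} - \begin{pmatrix} U \\ V \\ W \end{pmatrix} \tag{10}$$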

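wherein X', Y', Z' represent new three axes. Thus the point p(x', y') on the image pickup plane corresponding to the point
P(X', Y', Z') is represented by:

$$x' = f \cdot (X'/Z') \tag{11}$$

$$y' = f \cdot (Y'/Z') \tag{12}$$

In this state, the optical flow (u, v) = (x', y') - (x, y) is represented by:

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x' - x \\ y' - y \end{pmatrix} = \begin{pmatrix} f \cdot (X'/Z') - x \\ f \cdot (Y'/Z') - y \end{pmatrix} \tag{13}$$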


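For simplifying the explanation, by considering B only (A = C = U = V = W = 0), the equation (10) can be transformed as:

$$\begin{pmatrix} X' \\ Y' \\ Z' \end{pmatrix} = \begin{pmatrix} \cos B & 0 & -\sin B \\ 0 & 1 & 0 \\ \sin B & 0 & \cos B \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} \tag{14}$$

$$= \begin{pmatrix} X \cos B - Z \sin B \\ Y \\ X \sin B + Z \cos B \end{pmatrix} \tag{15}$$

By substituting these into the equation (13), there are obtained:

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} -(x^2 + f^2) \sin B / (f \cos B + x \sin B) \\ y f / (x \sin B + f \cos B) - y \end{pmatrix} \tag{16}$$

$$= \begin{pmatrix} -\sqrt{x^2 + f^2} \cdot \sin B / \cos(B - a) \\ -y + y \cdot \cos a / \cos(B - a) \end{pmatrix} \tag{17}$$

wherein:

$$a = \tan^{-1}(x/f) = \tan^{-1}(X/Z) \tag{18}$$
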
Considering B as the convergence angle, the above-explained conversion leads to the following conclusions, as 
indicated by the equation (17) and Fig. 7: 

(1) As the images are equivalent to those obtained in parallel image pickup by the camera, the projected object
points are of a same height in the two image pickup planes; and

(2) Hatched areas shown in Figs. 9A and 9B correspond to a portion where the image pickup is not conducted under 
the original image pickup conditions. 
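
The closed form of equation (17) can be checked numerically against a direct rotation and reprojection. The sketch below assumes a rotation about the Y axis only, written as X' = X cos B - Z sin B and Z' = X sin B + Z cos B as in equation (15); with that convention the horizontal flow u is negative for a positive B, and only its magnitude matters for the width of the region that is not imaged. The function names are illustrative.

```python
import math


def flow_by_rotation(x, y, f, B):
    """Flow (u, v) obtained by rotating the object point about Y and reprojecting."""
    Z = 1.0                                  # any depth: the flow of a pure rotation does not depend on it
    X, Y = x * Z / f, y * Z / f              # invert equations (8) and (9)
    Xp = X * math.cos(B) - Z * math.sin(B)   # rotation about the Y axis only
    Zp = X * math.sin(B) + Z * math.cos(B)
    return f * Xp / Zp - x, f * Y / Zp - y   # equations (11), (12) minus (x, y)


def flow_closed_form(x, y, f, B):
    """Flow (u, v) in the form of equation (17), with a from equation (18)."""
    a = math.atan2(x, f)
    u = -math.hypot(x, f) * math.sin(B) / math.cos(B - a)
    v = -y + y * math.cos(a) / math.cos(B - a)
    return u, v
```

Both functions agree to rounding error for any image point, which gives some confidence that the search-range width derived from the closed form matches the actual geometric displacement.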

Based on Figs. 9A and 9B and the equation (17), a point (x_0', y_0') corresponding to (x_0, y_0) is represented by:

$$\begin{pmatrix} x_0' \\ y_0' \end{pmatrix} = \begin{pmatrix} x_0 - \sqrt{x_0^2 + f^2} \cdot \sin B / \cos(B - a_0) \\ y_0 \cdot \cos a_0 / \cos(B - a_0) \end{pmatrix} \tag{19}$$

Thus, the width (c) of the above-mentioned hatched area is given by:

$$c = \left| \sqrt{x_0^2 + f^2} \cdot \sin B / \cos(B - a_0) \right|, \quad \text{wherein } a_0 = \tan^{-1}(x_0/f)$$

In the following there is assumed:

$$G = \sqrt{x_0^2 + f^2} \cdot \sin B / \cos(B - a_0) \tag{20}$$

From the foregoing conclusion (1), with respect to the vertical direction, the corresponding point for a point of a
height w can be searched at the same height w in the other image, as shown by the search range (a) in Figs. 9A and 9B.

With respect to the horizontal direction, there are considered the foregoing conclusion (2) and the following fact.

The corresponding point to a point (h, w) in the left image is present, in the right image, in a position up to (h, w).
This will be understood from the fact that, if the object point P is at an infinite distance and is projected at a position (h,
w) on the left image, it will also be projected at (h, w) on the right image.

On the other hand, the hatched areas do not contain the taken images, but the points corresponding to the image
from (0, w) to (G, w) in the right image should be present, if the corresponding image exists, in a portion of a same
width in the left image. Consequently, a range (d) in the left image lacks the corresponding points. Thus a point (h, w)
in the left image, if G < h, should have a corresponding point, in the right image, within a range from (G, w) to (h, w).
There can therefore be set a basic region (e) indicating the range of the corresponding point, as shown in Figs. 9A and 9B.

In the following there will be explained the search range determined in consideration of the above-mentioned basic 
region and an error in the phototaking parameters in the compound-eye image pickup device. 

In the foregoing description of the compound-eye image pickup device, there has only been considered an angle
B which is equal to 1/2 of the convergence angle, but in practice the phototaking parameters of the image pickup systems
contain angles A and C in the left-hand side image pickup system and errors ΔA and ΔC in the right-hand side image
pickup system.

In such situation, a portion corresponding to the errors is added to the above-mentioned basic search region, by
substituting these parameters into the equation (10) to derive (X', Y', Z') anew and calculating (u, v) corresponding
thereto. However, since such error portion is not known, a portion in consideration of the worst errors is usually added.
For example, a portion (i) shown in Fig. 9A is added in the vertical direction and, in the horizontal direction, a portion (f)
of 0) x G) as shown in Fig. 9B is added.
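
A minimal sketch of how such a search range might be assembled is given below. It assumes pixel coordinates with the origin at the left and top edges of the right image, takes the basic region as the span from (G, w) to (h, w), and widens it by worst-case margins on every side. The margin parameters, the symmetric widening and the clamping to the image are assumptions of this example, not details given in the description.

```python
def search_range(h, w, G, width, height, margin_h=0, margin_v=0):
    """Search range in the right image for the left-image point (h, w).

    Returns (x_min, x_max, y_min, y_max), or None when the point lies in the
    range that has no counterpart in the right image (h not greater than G).
    margin_h and margin_v are hypothetical worst-case error margins in pixels.
    """
    if h <= G:
        return None
    x_min = max(0, G - margin_h)            # basic region starts at G
    x_max = min(width - 1, h + margin_h)    # and ends at h (object at infinity)
    y_min = max(0, w - margin_v)            # same height w, plus vertical margin
    y_max = min(height - 1, w + margin_v)
    return x_min, x_max, y_min, y_max
```

With zero margins the range collapses to a single row segment, which is the basic region; the margins then trade a slightly longer search for robustness against parameter errors.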

In the foregoing first embodiment, the extraction of the corresponding point is conducted after correction for the
convergence angle of the images. In the following there will be explained an embodiment without such correction of the 
convergence angle. 

Under the presence of a convergence angle, the epipolar lines are, in general, not mutually parallel. Consequently, 
in a second embodiment of the present invention, the search range in the vertical direction is selected as the entire 
vertical range of the image or about a half thereof. The search range in the horizontal direction is selected, according 
to the consideration explained in the foregoing, as a hatched area (g) in Fig. 10. 

In the foregoing embodiments, in searching the corresponding points in the images obtained from plural image
pickup systems, the search is not conducted over the entire image but is limited to a portion thereof according to the
phototaking parameters of the image pickup system, whereby the time required for extracting the corresponding point
can be significantly reduced. Also such limitation of the search range eliminates extraction of erroneous corresponding
points outside said search range, thereby improving the reliability.

Also according to the foregoing embodiments, in case the plural image pickup systems have individual fluctuation,
the search range is determined by adding a marginal range corresponding to such individual fluctuation to the basic
search range determined from the phototaking parameters, whereby the search time can be reduced even in the
presence of the errors, while maintaining the reliability of the search.

In the following there will be explained a third embodiment of the present invention, providing a method for extracting
corresponding points in plural images, for clarifying the correspondence between time-sequentially obtained plural
images or plural images obtained from plural image pickup systems, and an image processing unit therefor.

For the ease of understanding, there will at first be explained the background of the present embodiment. The
template matching method is known as a representative method for extraction of the corresponding points, for clarifying
the correspondence among plural images. In this method, there is conceived a template surrounding a point, in a
reference image, for which the corresponding point is to be searched, and the corresponding point is determined by
calculating the similarity between said template and a range in the searched image.

Now reference is made to Fig. 11 for explaining the principle of the template matching method. As an example, in
case of searching a point in a searched image 702, corresponding to a point Q on the right ear of the person in a reference
image 701 shown in Fig. 11, a template 703 of a certain size around said point Q is prepared. This template 703 is
moved in the searched image 702 with the calculation of similarity at each position, and the corresponding point to the
point Q in the reference image 701 is determined at a position in the searched image 702 where the similarity is highest.

The similarity can be calculated, for example, utilizing the difference in pixel values as shown by the equation (21) 
or the correlation of pixel values as shown by the equation (22): 

$$E(x, y) = \sum_i \sum_j \left[ F(i, j) - A(i - x,\, j - y) \right]^2 \tag{21}$$

$$\sigma(x, y) = \frac{\sum_i \sum_j \{ F(i, j) \cdot A(i - x,\, j - y) \}}{\sqrt{\sum_i \sum_j F^2(i, j) \cdot \sum_i \sum_j A^2(i - x,\, j - y)}} \tag{22}$$

In these equations, F(i, j) indicates the searched image while A(i, j) indicates the template, and these equations
provide the similarity when the template is at a position (x, y). In the calculation according to the equation (21) the
corresponding point is given where E(x, y) becomes minimum, and the theoretical minimum of E(x, y) is zero. In the
calculation according to the equation (22), the corresponding point is given where σ(x, y) becomes maximum, of which
the theoretical maximum is 1.
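
The two similarity measures can be sketched as below. This is an illustrative sketch only: images are assumed to be lists of rows, the position (x, y) is taken as the offset of the template's top-left corner in the searched image, and the helper `best_match` is a hypothetical name. The normalized correlation assumes that neither patch is entirely zero.

```python
import math


def ssd(F, A, x, y):
    """Equation (21): sum of squared differences with the template at (x, y)."""
    return sum((F[j + y][i + x] - A[j][i]) ** 2
               for j in range(len(A)) for i in range(len(A[0])))


def ncc(F, A, x, y):
    """Equation (22): normalized correlation with the template at (x, y)."""
    num = f2 = a2 = 0.0
    for j in range(len(A)):
        for i in range(len(A[0])):
            p, q = F[j + y][i + x], A[j][i]
            num += p * q
            f2 += p * p
            a2 += q * q
    return num / math.sqrt(f2 * a2)


def best_match(F, A, score, pick):
    """Evaluate `score` wherever the template fits entirely inside F."""
    rows = len(F) - len(A) + 1
    cols = len(F[0]) - len(A[0]) + 1
    positions = [(x, y) for y in range(rows) for x in range(cols)]
    return pick(positions, key=lambda p: score(F, A, *p))
```

Calling `best_match(F, A, ssd, min)` selects the position of minimum difference, and `best_match(F, A, ncc, max)` the position of maximum correlation. Note that `best_match` only visits positions where the whole template fits, which is exactly the restriction to the central area discussed next.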

In the above-explained method, however, since the template is to be prepared around the point (805 or 806 in Fig.
12) for which the corresponding point is to be searched, the template 802 or 803 can only be prepared for the points
present in a central area 804 of the reference image, and the extraction of the corresponding point cannot be achieved
for the points present in the peripheral area of the reference image 801. Similarly the similarity cannot be calculated in
the entire searched image but only in the central area thereof.

Consequently, in case the point A' corresponding to a point A in the reference image 901 is present in the peripheral
area of the searched image 902 as shown in Fig. 13, the corresponding point is identified as not present or an erroneous
corresponding point is extracted.

Consequently the object of the present embodiment is to provide a method for extracting corresponding points in
plural images and an image processing device therefor, enabling extraction of the corresponding point for any point in
the entire reference image and also enabling calculation of similarity in the entire searched image, thereby improving
the precision of extraction of the corresponding point.

The above-mentioned object can be attained, according to the present embodiment, by a method of extracting 
corresponding points among plural images, based on the template matching method, for clarifying the correspondence 
among said plural images, wherein, in extracting the corresponding points in first and second images, the area of the 
template is varied depending on the position of said template on said first image. 

The area of the template is varied in case it is limited by the first image area in the peripheral portion thereof. Also
the area of calculation is varied according to the overlapping of said varied template and the second image. The calculation
area is further varied in case it is limited by the second image area in the peripheral portion thereof. Furthermore a
moving object can be extracted from the corresponding points extracted in the above-explained method. Also in case
said first and second images are simultaneously taken with different image pickup devices, the images are subjected
to epipolar conversion prior to the extraction of the corresponding points. Also there can be calculated the distance to
the object, from the corresponding points extracted in the above-explained method.

Also the image processing device of the present embodiment, for clarifying the correspondence between plural 
images by the template matching method, comprises image input means for entering first and second images, and 
template varying means for varying the area of the template according to the position thereof on said first image. 

Said template varying means is provided with first area limiting means for limiting the area of said template to the
area of said first image, in the peripheral portion thereof. It is further provided with calculation area varying means for
varying the calculation area, based on the overlapping of said template varied in area and said second image. Said
calculation area varying means is provided with second area limiting means for limiting said calculation area to the area
of said second image in the peripheral portion thereof. It is further provided with moving object extraction means for
extracting a moving object based on the extracted corresponding points. There is further provided epipolar conversion
means for effecting epipolar conversion on the images prior to the extraction of the corresponding points, in case said
first and second images are taken simultaneously with different image pickup devices. There is further provided distance
calculation means for calculating the distance from the extracted corresponding points to the object.

The template matching method of the above-explained configuration makes it possible to prepare the template, for searching
the corresponding point, in the entire area of the reference image, and to calculate the similarity in the entire area of the
searched image.
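
One possible reading of the two area limits is sketched below: the template offsets are first clipped to the reference (first) image, then the calculation area is clipped again to the searched (second) image, and the difference is averaged over whatever pixels remain. Dividing by the number of remaining pixels is an assumption of this example, introduced so that scores computed over areas of different size stay comparable. The function names are hypothetical.

```python
def clipped_offsets(cx, cy, half, width, height):
    """Offsets (di, dj) of a square window around (cx, cy) that stay inside the image."""
    return [(di, dj)
            for dj in range(-half, half + 1)
            for di in range(-half, half + 1)
            if 0 <= cx + di < width and 0 <= cy + dj < height]


def mean_sq_diff(ref, img, p, q, half):
    """Mean squared difference between the template around p in `ref`
    and the area around q in `img`, both limited to their image areas."""
    (px, py), (qx, qy) = p, q
    # First limit: the template area is clipped by the reference image.
    offs = clipped_offsets(px, py, half, len(ref[0]), len(ref))
    # Second limit: the calculation area is clipped by the searched image.
    offs = [(di, dj) for di, dj in offs
            if 0 <= qx + di < len(img[0]) and 0 <= qy + dj < len(img)]
    total = sum((ref[py + dj][px + di] - img[qy + dj][qx + di]) ** 2
                for di, dj in offs)
    return total / len(offs)  # the centre offset (0, 0) always survives
```

Because the centre offset is always kept when p and q lie inside their images, a score can be computed for every point of the reference image and every position in the searched image, including the corners.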

The present embodiment will be clarified in further detail with reference to the attached drawings.

Fig. 15 illustrates an example of the system relating to the extraction of corresponding points in the image processing
device of the third embodiment.

There are provided a camera 201 constituting an image pickup device; a memory 202 for storing the image obtained
by the camera 201; a corresponding point extraction unit 203 for extracting corresponding points in the image stored in
the memory 202 and an image currently obtained by the camera 201; and a moving object extraction unit 204 for
extracting a moving object, based on moving vectors of the pixels, obtained by the corresponding point extraction unit 203.
This system is used for precisely extracting a moving object from the image taken by the camera 201 and displaying
the moving object only or cutting out the area of the moving object for the purpose of moving image compression.

The above-explained system functions in the following manner. The image entered from the camera 201 is supplied
to the memory 202 and the corresponding point extraction unit 203. The memory 202 has a capacity of plural images,
in order that the currently entered image is not overwritten on the previously entered image. The corresponding point
extraction unit 203 effects extraction of the corresponding points in the entire area of the entered image and an
immediately preceding image from the memory 202, as will be explained later. Thus the corresponding point extraction
unit 203 determines movement vectors based on the immediately preceding input image. The moving object extraction
unit 204 classifies the movement vectors of the pixels of the reference image, obtained in the corresponding point
extraction unit 203, according to the direction and magnitude of the vectors, thus dividing the areas, and extracts an
area of the movement vectors, different from those of the background, as a moving object.
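
The classification of movement vectors described above can be illustrated with a minimal Python sketch. The text does not specify the classification rule, so the sketch assumes that the most frequent movement vector represents the background and that a pixel belongs to a moving object when its vector differs from the background vector by more than a tolerance. The function name `extract_moving_mask` and the parameter `tol` are hypothetical.

```python
from collections import Counter
from math import hypot

def extract_moving_mask(vectors, tol=1.0):
    """vectors: 2-D list of (dx, dy) movement vectors, one per pixel.

    Returns a 2-D list of booleans, True where the pixel is treated as
    part of a moving object.
    """
    # Assumption: the most common vector is the background motion.
    counts = Counter(v for row in vectors for v in row)
    bg = counts.most_common(1)[0][0]
    # A pixel is "moving" when its vector departs from the background
    # vector (in direction and/or magnitude) by more than tol pixels.
    return [[hypot(dx - bg[0], dy - bg[1]) > tol for (dx, dy) in row]
            for row in vectors]
```

Because every pixel carries its own vector, the resulting mask is defined per pixel rather than per block.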

In the following there will be explained the extraction of the corresponding points in the entire reference image, 
executed in the corresponding point extraction unit 203. Figs. 14A and 14B illustrate the preparation of the template in 
the reference image and the movement of the template in the searched image. 

In Fig. 14A, hatched areas indicate templates 104 - 106 corresponding to points 101 - 103 for which the corresponding
points are to be searched. If the image in Fig. 14A is taken as the reference image 110 from the memory 202, the
template is prepared in the conventional manner in the central area of the reference image 110 (template 105). In the
peripheral portion of the reference image 110, an area of the same size as the template 105 in the central portion of the
reference image 110 is considered about the point 101 or 103, and an overlapping portion of said area and the reference
image 110 is defined as the template 104 or 106 (hatched areas in Fig. 14A).

For example, if the template of a point in the central area of the reference image 110 has a size of 7 x 7 pixels, the
template 104 for the point 101 in Fig. 14A has a size of 4 x 4 pixels. Thus the template for a point, for which the
corresponding point is to be searched, in the peripheral portion of the reference image 110 is different in shape and size
in comparison with the template for the point in the central area, and the position of the point for which the
corresponding point is to be searched is displaced from the center of the template.
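
The limitation of the template to the area of the reference image can be sketched as follows. This is a minimal Python illustration, assuming a square template of odd size centered on the point and an image addressed by integer pixel coordinates starting at 0; the function names `clipped_template` and `template_size` are hypothetical.

```python
def clipped_template(px, py, width, height, size=7):
    """Return (left, top, right, bottom), the inclusive pixel bounds of the
    template for the point (px, py) in an image of width x height pixels.

    The full template is size x size pixels centered on the point; the part
    falling outside the image is cut off.
    """
    half = size // 2
    left = max(px - half, 0)
    top = max(py - half, 0)
    right = min(px + half, width - 1)
    bottom = min(py + half, height - 1)
    return left, top, right, bottom

def template_size(bounds):
    """Width and height, in pixels, of a template given its bounds."""
    left, top, right, bottom = bounds
    return right - left + 1, bottom - top + 1
```

With a 7 x 7 template, a point on the corner pixel of the image yields a 4 x 4 template, and the point lies at the corner of the template rather than at its center.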

Now reference is made to Fig. 14B for explaining the method of calculating the similarity, with the above-explained 
template, in a searched image 120 entered from the camera 201. As an example, the template 106 in the reference 
image 110 shown in Fig. 14A is used.

The point 103 for which the corresponding point is to be searched is placed in succession on the points 111 - 113 in
the searched image 120 shown in Fig. 14B, whereupon overlapping between the template 106 and the searched image
120 takes place in grid-patterned areas in Fig. 14B. The similarity is calculated according to the foregoing equation (21)
or (22), utilizing the pixel values in said grid-patterned areas.

If the grid-patterned area has a horizontal length $h_r$ and a vertical length $v_r$ in Fig. 14B, summation in the equation
(21) or (22) is taken in a range of $h_r$ in the horizontal direction and $v_r$ in the vertical direction.

Summarizing the foregoing consideration on sizes, the following relations hold, wherein h and v are the maximum
sizes of the template in the horizontal and vertical directions, and $h_m$ and $v_m$ are the sizes of the template prepared from
the reference image 110, in the horizontal and vertical directions:

$$h \ge h_m \ge h_r$$

$$v \ge v_m \ge v_r$$

In case of using the equation (21), utilizing the sum of the remnant differences, in the calculation of similarity, a
higher precision in the determination of the corresponding point can be achieved by employing, instead of E(x, y) for
each point, the remnant difference per calculation E'(x, y) = E(x, y)/C, obtained by dividing E(x, y) by the number C of
calculations used therefor.
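
The normalization E'(x, y) = E(x, y)/C can be sketched in Python as below. Equation (21) is not reproduced in this passage, so the sketch assumes it is a sum of absolute pixel differences; images are assumed to be 2-D lists of pixel values, and the function name `normalized_residual` is hypothetical. Only the offsets at which the template overlaps both the reference image and the searched image enter the sum, and C counts those offsets.

```python
def normalized_residual(ref, srch, px, py, qx, qy, size=7):
    """Remnant difference per calculation, E' = E / C.

    ref, srch: 2-D lists of pixel values (rows of equal length).
    (px, py): point in the reference image.
    (qx, qy): candidate corresponding point in the searched image.
    Returns None when no pixel overlaps.
    """
    half = size // 2
    ref_h, ref_w = len(ref), len(ref[0])
    srch_h, srch_w = len(srch), len(srch[0])
    total = 0   # E: sum of absolute differences (assumed form of eq. (21))
    count = 0   # C: number of pixel pairs actually compared
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            rx, ry = px + dx, py + dy
            sx, sy = qx + dx, qy + dy
            # Keep only offsets that fall inside both images.
            if (0 <= rx < ref_w and 0 <= ry < ref_h
                    and 0 <= sx < srch_w and 0 <= sy < srch_h):
                total += abs(ref[ry][rx] - srch[sy][sx])
                count += 1
    return total / count if count else None
```

Dividing by C keeps candidates with a small overlap from appearing more similar merely because fewer differences were summed.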

As explained in the foregoing, a moving object can be extracted with satisfactory precision in a system provided,
as shown in Fig. 15, with the corresponding point extraction unit 203 capable of preparing the template in the entire
reference image and moving the template in the entire area of the searched image. Also in contrast to the conventional
moving image compression method in which the image is divided into a certain number of blocks and the moving area
is extracted from such blocks, the method of the present embodiment enables precise extraction of the moving area in
the unit of each pixel, thereby achieving an improvement in the compression rate and an improvement in the resolving
power when the image is expanded.



The foregoing embodiment has been explained with an image taken with a camera, but this is not essential and a 
similar effect can also be obtained for example with an image obtained from a CD-ROM. 

Fig. 16 shows an example of the system for extracting the corresponding points in a fourth embodiment of the image
processing device, adapted for obtaining the distance distribution of the object based on images obtained from plural 
cameras. 

There are provided a right-hand side camera 301 constituting an image pickup device; a left-hand side camera 302
constituting an image pickup device; a right-hand side epipolar conversion unit 303 for converting the image, obtained
with a convergence angle by the right-hand side camera 301, into a state without convergence angle; a left-hand side
epipolar conversion unit 304 of a similar function; a corresponding point extraction unit 305 for extracting the
corresponding points of the images obtained by the right- and left-hand side epipolar conversion units 303, 304; a distance
measurement unit 306 for calculating, by trigonometric principle, the distance distribution of the object based on the
corresponding points obtained from the corresponding point extraction unit 305; and a synchronization circuit 307 for
synchronizing the timing of phototaking of the cameras 301, 302.

The above-explained system functions in the following manner. 

Under synchronization by the synchronization circuit 307, the right- and left-hand side cameras 301, 302 simulta- 
neously provide a right image 308 and a left image 309, which are subjected to epipolar conversion, into a state without 
convergence angle, respectively by the epipolar conversion units 303, 304. 

This epipolar conversion will be explained in the following. 

As shown in Fig. 17, three axes are represented by X, Y, Z; rotational motions about the three axes by A, B, C;
translational motions by U, V, W; focal length by f; coordinate axes in an image pickup plane by x, y; and a point on the
image pickup plane corresponding to the object point P(X, Y, Z) by p(x, y). It should be noted that practically, the object
is picked up by the optical system shown in Fig. 2, while it is assumed in Fig. 6 for convenience that the image pickup
plane is at the position $P_L$, $P_R$ in front of the lens optical system. In this state there stand:

$$x = f \times \frac{X}{Z} \tag{23}$$

$$y = f \times \frac{Y}{Z} \tag{24}$$

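With the rotational and translational motions of the three axes, there stands:

$$
\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos A & \sin A \\ 0 & -\sin A & \cos A \end{bmatrix}
\begin{bmatrix} \cos B & 0 & -\sin B \\ 0 & 1 & 0 \\ \sin B & 0 & \cos B \end{bmatrix}
\begin{bmatrix} \cos C & \sin C & 0 \\ -\sin C & \cos C & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} -
\begin{bmatrix} U \\ V \\ W \end{bmatrix}
\tag{25}
$$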


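wherein X', Y', Z' represent new three axes.

Thus the point p(x', y') on the image pickup plane corresponding to the point P(X', Y', Z') is represented by:

$$x' = f \times \frac{X'}{Z'} \tag{26}$$

$$y' = f \times \frac{Y'}{Z'} \tag{27}$$

In this state, the optical flow (u, v) = (x', y') - (x, y) is represented by:

$$
\begin{bmatrix} u \\ v \end{bmatrix} =
\begin{bmatrix} x' \\ y' \end{bmatrix} - \begin{bmatrix} x \\ y \end{bmatrix} =
f \begin{bmatrix} X'/Z' - X/Z \\ Y'/Z' - Y/Z \end{bmatrix}
\tag{28}
$$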

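For simplifying the explanation, by considering B only (A = C = U = V = W = 0), the equation (25) can be transformed
as:

$$
\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} =
\begin{bmatrix} \cos B & 0 & -\sin B \\ 0 & 1 & 0 \\ \sin B & 0 & \cos B \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
\tag{29}
$$

$$
\begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} =
\begin{bmatrix} X \cos B - Z \sin B \\ Y \\ X \sin B + Z \cos B \end{bmatrix}
\tag{30}
$$

By substituting these into the equation (28), there are obtained:

$$
\begin{bmatrix} u \\ v \end{bmatrix} =
\begin{bmatrix} -(x^2 + f^2) \sin B / (f \cos B + x \sin B) \\ y f / (x \sin B + f \cos B) - y \end{bmatrix}
\tag{31}
$$

$$
\begin{bmatrix} u \\ v \end{bmatrix} =
\begin{bmatrix} -\sqrt{x^2 + f^2} \, \sin B / \cos (B - \alpha) \\ -y + y \cos \alpha / \cos (B - \alpha) \end{bmatrix}
\tag{32}
$$

wherein:

$$\alpha = \tan^{-1}(x/f) = \tan^{-1}(X/Z)$$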

By considering the rotation B as the convergence angle, the epipolar conversion can be achieved by the above-written
equations. Said convergence angle can be measured, for example by an encoder in the convergence angle control
unit, though it is not illustrated in Fig. 17.
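
As a numerical check of the conversion, the following Python sketch evaluates the optical flow for a pure rotation B in the two forms of equations (31) and (32), and moves an image point accordingly. It is only an illustration under the stated simplification A = C = U = V = W = 0; the function names are hypothetical, and angles are assumed to be in radians with f > 0.

```python
from math import atan2, cos, sin, sqrt

def flow_eq31(x, y, f, B):
    """Optical flow (u, v) for a rotation B, in the form of equation (31)."""
    den = f * cos(B) + x * sin(B)
    u = -(x * x + f * f) * sin(B) / den
    v = y * f / den - y
    return u, v

def flow_eq32(x, y, f, B):
    """Same flow in the form of equation (32), using alpha = arctan(x / f)."""
    alpha = atan2(x, f)
    u = -sqrt(x * x + f * f) * sin(B) / cos(B - alpha)
    v = -y + y * cos(alpha) / cos(B - alpha)
    return u, v

def epipolar_convert_point(x, y, f, B):
    """Image position (x', y') of the point (x, y) after the rotation B."""
    u, v = flow_eq31(x, y, f, B)
    return x + u, y + v
```

The two forms agree because x sin B + f cos B equals sqrt(x^2 + f^2) cos(B - alpha) when tan(alpha) = x/f.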

The corresponding points are extracted from the right-hand side epipolar image 311 and the left-hand side epipolar
image 312, obtained after the above-mentioned conversion, in the corresponding point extraction unit 305. With the
left-hand side epipolar image 312 as the reference image, the template can be prepared in the entire area of the reference
image as in the foregoing embodiment. However the movable range of the template within the searched right-hand side
epipolar image 311 is different from that in the third embodiment.

Now there will be given an explanation on the movable range. Owing to the epipolar conversion of the left and right
images in the right-hand side epipolar conversion unit 303 and the left-hand side epipolar conversion unit 304, the
obtained images are equivalent to those taken in a mutually parallel state. Consequently, in the vertical direction of the
images, the corresponding points are present at the same height in the images. Therefore, for extracting the
corresponding point of a point 504 in the reference image 501, the template 503 only needs to be moved in a single row as
shown in Figs. 18A and 18B, and the calculation area in the searched image 502 varies as 505 - 507.

It is also effective to effect the search in several rows, instead of a single row, in consideration of an error in the
calculation or in the reading of the convergence angle.
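
The restricted movable range can be sketched in Python as follows. This is a minimal illustration, assuming 2-D lists of pixel values, a sum of absolute differences divided by the number of compared pixels as the similarity measure, and a hypothetical parameter `row_margin` giving the number of extra rows searched above and below the row of the reference point (0 for a single row).

```python
def search_along_rows(ref, srch, px, py, size=7, row_margin=0):
    """Find the point in `srch` corresponding to (px, py) of `ref`,
    searching only rows py - row_margin .. py + row_margin.

    Returns (qx, qy) of the candidate with the smallest difference per
    compared pixel; the template is clipped to both images.
    """
    half = size // 2
    ref_h, ref_w = len(ref), len(ref[0])
    srch_h, srch_w = len(srch), len(srch[0])
    best = None
    first_row = max(py - row_margin, 0)
    last_row = min(py + row_margin, srch_h - 1)
    for qy in range(first_row, last_row + 1):
        for qx in range(srch_w):
            total = count = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rx, ry, sx, sy = px + dx, py + dy, qx + dx, qy + dy
                    # Compare only pixels lying inside both images.
                    if (0 <= rx < ref_w and 0 <= ry < ref_h
                            and 0 <= sx < srch_w and 0 <= sy < srch_h):
                        total += abs(ref[ry][rx] - srch[sy][sx])
                        count += 1
            score = total / count
            if best is None or score < best[0]:
                best = (score, qx, qy)
    return best[1], best[2]
```

Compared with a search over the entire searched image, the number of candidate positions drops from the full pixel count to one row (or a few rows) of pixels.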

Based on the result of extraction of the corresponding points in the entire area by the corresponding point extraction 
unit 305, the distance measurement unit 306 calculates the distance distribution of the object by the trigonometric method 
explained in the following. 

As shown in Fig. 19, centers $O_L$, $O_R$ of the object-side principal planes of the left and right phototaking lens groups
301, 302 are positioned on the X-axis, symmetrically with respect to the Z-axis, and the length between said centers
$O_L$, $O_R$ is defined as the baseline b, whereby the coordinates of the centers $O_L$, $O_R$ are represented respectively by (-b/2,
0, 0) and (b/2, 0, 0). It should be noted that practically, the object is picked up by the optical system shown in Fig. 2,
while it is assumed in Fig. 6 for convenience that the image pickup plane is at the position $P_L$, $P_R$ in front of the lens
optical system.

When a point P in the three-dimensional space is projected toward the centers $O_L$, $O_R$, there are obtained projection
points $P_L$, $P_R$ respectively on the left and right CCD sensors $A_{SL}$, $A_{SR}$, wherein the points P, $P_L$, $P_R$ are respectively
represented by coordinates $(X_P, Y_P, Z_P)$, $(X_{PL}, Y_{PL}, Z_{PL})$, $(X_{PR}, Y_{PR}, Z_{PR})$. The object is to determine the point
$P(X_P, Y_P, Z_P)$. The values $(X_{PL}, Y_{PL})$ and $(X_{PR}, Y_{PR})$ are obtained from the corresponding point extraction unit 305,
and $Z_{PL} = Z_{PR} = f$, wherein f is the focal length of the lens in case the phototaking is conducted with parallel optical axes.
The distance distribution can be obtained by substituting these known values into the following three equations:

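$$
X_P = \frac{b}{2} \cdot
\frac{(X_{PL} + b/2)/Z_{PL} + (X_{PR} - b/2)/Z_{PR}}
     {(X_{PL} + b/2)/Z_{PL} - (X_{PR} - b/2)/Z_{PR}}
\tag{33}
$$

$$
Y_P = \frac{Y_{PR}}{Z_{PR}} \cdot Z_P = \frac{Y_{PL}}{Z_{PL}} \cdot Z_P
\tag{34}
$$

$$
Z_P = \frac{b}{(X_{PL} + b/2)/Z_{PL} - (X_{PR} - b/2)/Z_{PR}}
\tag{35}
$$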
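
The trigonometric calculation of equations (33) to (35) can be sketched in Python as follows. This is a minimal illustration assuming parallel optical axes ($Z_{PL} = Z_{PR} = f$), lens centers at (-b/2, 0, 0) and (b/2, 0, 0), and image coordinates expressed in the same units as b and f; the function name `triangulate` is hypothetical.

```python
def triangulate(x_pl, y_pl, x_pr, y_pr, f, b):
    """Object point (X_P, Y_P, Z_P) from the corresponding points
    (x_pl, y_pl) and (x_pr, y_pr) in the left and right images."""
    z_pl = z_pr = f                      # parallel optical axes
    left = (x_pl + b / 2) / z_pl         # slope of the ray through O_L
    right = (x_pr - b / 2) / z_pr        # slope of the ray through O_R
    diff = left - right
    if diff == 0:
        raise ValueError("rays are parallel; the point is at infinity")
    z_p = b / diff                               # equation (35)
    x_p = (b / 2) * (left + right) / diff        # equation (33)
    y_p = (y_pl / z_pl) * z_p                    # equation (34)
    return x_p, y_p, z_p
```

For example, with b = 4 and f = 2, the object point (1, 2, 10) projects to (-1.4, 0.4) on the left image plane and (1.8, 0.4) on the right one, and the function recovers the original point from those two projections. Applying it to every pair of corresponding points gives the distance distribution pixel by pixel.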

As explained in the foregoing, the system utilizing the corresponding point extraction unit 305 capable of extracting
the points corresponding to all the pixels in the reference image, as shown in Fig. 16, can provide a smooth distance
distribution in the pixel level, instead of the distribution in the unit of the image block.

In the fourth embodiment, the images are subjected to epipolar conversion, but it is also possible to effect extraction
of the corresponding points by moving the template in the entire area of the searched image without such epipolar
conversion, as in the first embodiment.

In either case, the corresponding points can be extracted for all the pixels in the reference image. 

The present invention is applicable either to a system composed of plural pieces of equipment, or to an apparatus
consisting of a single piece of equipment. It is naturally applicable also to a case where the present invention is achieved
by the supply of a program to a system or an apparatus.

As explained in the foregoing embodiments, in extracting the corresponding points among plural images, the present
invention makes it possible to determine the corresponding points for the entire area, where the corresponding points
are present, in the reference image, whereby the corresponding points are obtained in a larger number and at a higher
density than in the conventional method and the precision of extraction of the corresponding points can be improved.



Claims 

1. A compound-eye image pickup device comprising:
plural image pickup systems;

search means for searching paired corresponding points in plural images obtained from said plural image
pickup systems; and

search range determination means for determining the range to be searched by said search means, according
to phototaking parameters of said plural image pickup systems.

2. A compound-eye image pickup device according to claim 1, wherein said search range determination means is 
adapted to determine the search range by adding, to a basic search range determined according to said phototaking 
parameters, a marginal range based on the individual difference of said plural image pickup systems. 


3. A compound-eye image pickup device according to claim 1, wherein said search means is based on a template
matching method in which a template of an image is moved in another image and a corresponding point is extracted 
at a position where the correlation becomes maximum. 

4. A compound-eye image pickup device according to claim 1, wherein said phototaking parameters are information
on the convergence angle.

5. A compound-eye image pickup device according to claim 4, wherein said phototaking parameters are information
on the focal length.

6. A compound-eye image pickup device according to claim 1, wherein said search range determination means is
adapted to determine the search range by converting images, taken in the presence of a convergence angle, into
images taken without such convergence angle.

7. An image pickup device comprising:
plural image pickup means;

memory means for respectively storing image information taken by said plural image pickup means;
means for determining the convergence angle of said image pickup means;

area determination means for determining, based on said convergence angle, a range for searching
corresponding points in the plural images stored in said memory means; and

correction means for correcting the range, determined by said area determination means, based on error
information in the phototaking parameters of said image pickup means.

8. An image pickup device according to claim 7, wherein said search is based on a template matching method in which
a template of an image is moved in another image and a corresponding point is extracted at a position where the
correlation becomes maximum.

9. An image pickup device according to claim 8, wherein said area determination means is adapted to determine the
search area by converting images, taken in the presence of a convergence angle, into images taken without such
convergence angle.

10. A corresponding point extracting method for clarifying the correspondence between plural images by a template
matching method, comprising steps of: varying the area of the template according to the position thereof on said
first image in extracting corresponding points of first and second images.

11. A method according to claim 10, wherein the variation of said template area is conducted in such a manner that the
template area is limited to the area of said first image in a peripheral portion thereof.

12. A method according to claim 10, wherein the area of calculation of correlation is further varied according to the 
overlapping of said template varied in area and said second image. 

13. A method according to claim 12, wherein the variation of said area of calculation is conducted in such a manner 
that said area of calculation is limited to the area of said second image in a peripheral portion thereof. 

14. A method according to claim 10 or 12, wherein images are subjected to epipolar conversion prior to the extraction
of the corresponding points, in case said first and second images are simultaneously taken with different image
pickup devices.

15. A distance measurement method, wherein the distance from the extracted corresponding points to the object is 
calculated by the method defined in claim 14. 

16. A moving object extracting method, wherein a moving object is extracted from the corresponding points extracted 
by the method defined in claim 10 or 12. 

17. An image processing device for clarifying the correspondence between plural images by a template matching 
method, comprising: 

image input means for entering first and second images; and 

template varying means for varying the area of the template according to the position thereof on said first 
image. 

18. An image processing device according to claim 17, wherein said template varying means includes first area limiting 
means for limiting the area of said template to the area of said first image in a peripheral portion thereof. 

19. An image processing device according to claim 17, further comprising: 

calculation area varying means for varying the area of calculation based on the overlapping of said template 
varied in area and said second image. 

20. An image processing device according to claim 19, wherein said calculation area varying means includes second 
area limiting means for limiting said area of calculation to the area of said second image in a peripheral portion 
thereof. 

21. An image processing device according to claim 17, further comprising: 

moving object extraction means for extracting a moving object based on the extracted corresponding points. 

22. An image processing device according to claim 17, further comprising: 

epipolar conversion means for effecting epipolar conversion on said first and second images prior to the extrac- 
tion of the corresponding points, in case said images are taken simultaneously with different image pickup devices. 

23. An image processing device according to claim 22, further comprising: 

distance calculation means for calculating the distance from the extracted corresponding points to the object. 

24. An image processing apparatus comprising a pair of image pickup systems, and means for generating either a
panoramic or a high definition image by synthesising the outputs of the two systems.